Abstract



Random graphs are matrices with independent 0-1 elements whose probabilities are
determined by a small number of parameters. One of the oldest models is the Rasch model,
where the odds are ratios of positive numbers scaling the rows and columns. Later Persi
Diaconis and his coworkers rediscovered the model for symmetric matrices and called it the
beta model. Here we give goodness-of-fit tests for the model and extend it to a
version of the block model introduced by Holland, Laskey, and Leinhardt.

1 Introduction 

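Let $n$ be a positive integer and let $\varepsilon(i,j)$, $1 \le i, j \le n$, be 0-1 random
variables, independent for $i < j$, such that $\varepsilon(i,j) = \varepsilon(j,i)$ and
$\varepsilon(i,i) = 0$, furthermore

$$ P(\varepsilon(i,j) = 1) = p + p_i + p_j, \qquad 1 \le i < j \le n, \tag{1} $$
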
where the sum of the $p_i$-s is zero. The least squares estimate $\hat p$ of $p$ is the average of
the $\varepsilon(i,j)$-s, and the least squares estimate of $p_i$ is the average of the differences
$\varepsilon(i,j) - \hat p$. The modification of the model for non-symmetric matrices is
straightforward, and in that case the statistical inference is practically a two-way analysis of
variance. This is perhaps the simplest random graph model, but it shares an inconvenient property
of many other random graph models: it is hard to ensure that the edge probabilities remain in the
interval (0, 1). If we use the odds

$$ r_{ij} = \frac{p_{ij}}{1 - p_{ij}}, \qquad 1 \le i < j \le n, \tag{2} $$

instead of the probabilities $p_{ij} = P(\varepsilon(i,j) = 1)$, then it is enough to ensure the
positivity of the $r_{ij}$-s. This is the case in the model introduced by Georg Rasch [31].
Historically the odds were defined as ratios of scaling factors for rows and columns, but we
prefer the multiplicative form

$$ r_{ij} = \beta_i \gamma_j \tag{3} $$

for the non-symmetric case and

$$ r_{ij} = \beta_i \beta_j \tag{4} $$

for the symmetric case. Statistical investigation of the model started with Andersen [1] (see also
[21], [30], [33]), and later Persi Diaconis and his coworkers rediscovered the model and introduced
the name beta-model for its parameter. The model has many attractive properties:

- degree sequences are sufficient statistics

- the model covers practically all possible expected degree sequences

- the conditional distribution of the graph, given a prescribed degree sequence, is
uniform on the set of all graphs with that degree sequence.
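
The first and third properties can be read off the likelihood. The following is a short sketch, assuming the symmetric form (4) with $p_{ij} = \beta_i\beta_j/(1+\beta_i\beta_j)$ and writing $d_i = \sum_{j \ne i} \varepsilon(i,j)$ for the degree of vertex $i$ (the symbol $d_i$ is introduced here for illustration only):

```latex
\begin{align*}
P(\varepsilon)
  &= \prod_{i<j} p_{ij}^{\varepsilon(i,j)} (1-p_{ij})^{1-\varepsilon(i,j)}
   = \prod_{i<j} \frac{(\beta_i\beta_j)^{\varepsilon(i,j)}}{1+\beta_i\beta_j}
   = \frac{\prod_{i=1}^{n} \beta_i^{d_i}}{\prod_{i<j} (1+\beta_i\beta_j)}, \\
\frac{\partial \log P(\varepsilon)}{\partial \beta_i} = 0
  &\iff d_i = \sum_{j \ne i} \frac{\beta_i\beta_j}{1+\beta_i\beta_j},
  \qquad i = 1, \ldots, n.
\end{align*}
```

The probability of a graph depends on the graph only through $(d_1, \ldots, d_n)$, so the degrees are sufficient and all graphs with the same degree sequence are equally likely. The second line says that the maximum likelihood estimate matches the expected degrees to the observed ones.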

Statistical inference emerged from the Gaussian distribution and was later extended to random
variables in Euclidean spaces, but statistical inference on discrete structures is rather sparse
(see e.g. [26]). The mathematical investigation of graphs has its own history. Nowadays we speak
of networks instead of graphs ([27]), and the most investigated model is the stochastic
block model introduced by Holland, Laskey, and Leinhardt. Here the vertices are labeled
by small numbers or colors, and edge probabilities depend only on the labels ([17]). With
an eye on preferential attachment, where degree sequences follow a scale-free power law, the block
model was criticized because it has limited flexibility in its degree sequences. Chung, Lu, and
Vu [14] introduced a model with independent vertices, and Chaudhuri, Chung, and Tsiatas ([10])
introduced the planted partition model (see also [25]). Karrer and Newman [20] proposed
another extension of the block model. A natural extension of these models is the unification of
the beta and block models:

$$ r_{ij} = b(i,c(j))\, b(j,c(i)), \tag{5} $$

where $b(\cdot,\cdot)$ is a positive matrix with $n$ rows and $k$ columns, and $c(i)$ is the label
of the $i$-th vertex, i.e. an integer between 1 and $k$. We call this the k-beta model. The
estimation of the labels in block models is possible by the spectral method ([32]). It is generally
believed that the eigenvectors and eigenvalues of the matrix $\varepsilon(i,j)$ tell everything
about the structure of the graph ([10], [12], [13], [22], [23], [24]), while there are many
attempts to provide more flexible models ([29]).
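
To make (5) concrete, here is a minimal sketch of how a graph could be drawn from the k-beta model. It assumes that the odds in (5) are turned into probabilities by $p = r/(1+r)$; the function names, the parameter values, and the 0-based labels are illustrative choices, not part of the model's definition.

```python
import random

def k_beta_probabilities(b, c):
    """Edge probabilities p = r / (1 + r) with odds r = b[i][c[j]] * b[j][c[i]], as in (5)."""
    n = len(c)
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):
            r = b[i][c[j]] * b[j][c[i]]
            p[i][j] = p[j][i] = r / (1.0 + r)
    return p

def sample_graph(p, rng):
    """Draw a symmetric 0-1 adjacency matrix with independent edges and an empty diagonal."""
    n = len(p)
    eps = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):
            if rng.random() < p[i][j]:
                eps[i][j] = eps[j][i] = 1
    return eps

rng = random.Random(0)
n, k = 6, 2
c = [i % k for i in range(n)]                                       # hypothetical labels (0-based)
b = [[rng.uniform(0.5, 2.0) for _ in range(k)] for _ in range(n)]   # hypothetical positive parameters
p = k_beta_probabilities(b, c)
eps = sample_graph(p, rng)
```

With `k = 1` the odds reduce to `b[i][0] * b[j][0]`, which is the symmetric form (4).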
2 Goodness-of-fit



We cannot test edge independence on a single graph. While an i.i.d. sample is common in
statistical inference, in the case of graphs the sample generally means a single copy of a graph.
Perhaps the number one question in statistical inference is the following. Let

$$ p_1, \ldots, p_n \tag{6} $$

be an arbitrary given sequence of probabilities, and let

$$ \varepsilon_1, \ldots, \varepsilon_n \tag{7} $$

be independent 0-1 variables such that $P(\varepsilon_i = 1) = p_i$. Can we test the model? A
randomized answer is the following. Let

$$ u_1, \ldots, u_n \tag{8} $$

be independent and uniformly distributed in (0, 1), independently of the $\varepsilon_i$-s. Then

$$ x_i = \varepsilon_i p_i u_i + (1 - \varepsilon_i)\bigl(p_i + (1 - p_i) u_i\bigr), \qquad i = 1, \ldots, n, \tag{9} $$

are independent and uniformly distributed in (0, 1), which we can test. Another, more practical
solution is to order the pairs $(p_i, \varepsilon_i)$ according to the $p_i$-s in increasing order
and compare their partial sums. Or we can clump them into a small number of blocks and again
compare the sums. All these possibilities hold for graphs with estimated edge probabilities. Let us
partition the edges of the complete graph into blocks formed according to the
edge probabilities. In each part the edge probabilities are close to each other, whence the
$\varepsilon(i,j)$-s corresponding to that part behave like a pure random graph, which we can again
test, e.g. by their sums on subsets of vertices.
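
The randomization in (9) can be sketched as follows. This is only an illustration under stated assumptions: the probabilities are made up, and the Kolmogorov-Smirnov distance is one possible way to check uniformity, not necessarily the test intended above.

```python
import random

def randomized_uniforms(p, eps, rng):
    """Apply (9): x_i = p_i * u_i if eps_i == 1, else p_i + (1 - p_i) * u_i."""
    xs = []
    for p_i, e_i in zip(p, eps):
        u = rng.random()
        xs.append(p_i * u if e_i == 1 else p_i + (1.0 - p_i) * u)
    return xs

def ks_statistic(xs):
    """Kolmogorov-Smirnov distance between the empirical cdf of xs and the uniform cdf on (0, 1)."""
    xs = sorted(xs)
    m = len(xs)
    return max(max((i + 1) / m - x, x - i / m) for i, x in enumerate(xs))

rng = random.Random(1)
m = 2000
p = [rng.uniform(0.05, 0.95) for _ in range(m)]           # hypothetical probabilities
eps_good = [1 if rng.random() < q else 0 for q in p]       # data that follow the model
eps_bad = [1 if rng.random() < 0.9 else 0 for _ in p]      # data that ignore the p_i-s

xs_good = randomized_uniforms(p, eps_good, rng)
d_good = ks_statistic(xs_good)                             # small when the model holds
d_bad = ks_statistic(randomized_uniforms(p, eps_bad, rng)) # large when it does not
```

The extra randomness of the $u_i$-s costs some power, which is one reason the ordered partial sums are described as the more practical alternative.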

Blitzstein and Diaconis propose the following general procedure for testing the beta model.
Let us choose any graph statistic and determine it on our graph. Let us generate, according to
the uniform distribution, as many graphs as we can with the same degree sequence as the
investigated graph, and let us calculate the chosen statistic on each of them. If the value for
the sample graph lies within the range of the generated values, we accept the beta model,
otherwise we reject it. One can ask: does the choice of the statistic have any effect on the
power of the test?
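
The procedure can be sketched in a few lines. Note the substitution: the sketch below does not sample exactly from the uniform distribution; it runs a simple degree-preserving edge-swap Markov chain, whose stationary distribution is uniform, and takes a state every `steps` moves. The function names, the triangle count as statistic, and the chain lengths are all illustrative assumptions.

```python
import random

def swap_step(edges, edge_set, rng):
    """Try one degree-preserving swap {a,b},{c,d} -> {a,d},{c,b}; do nothing if it is invalid."""
    i, j = rng.sample(range(len(edges)), 2)
    a, b = edges[i]
    c, d = edges[j]
    if rng.random() < 0.5:                      # random orientation of the second edge
        c, d = d, c
    new1, new2 = frozenset((a, d)), frozenset((c, b))
    if len(new1) < 2 or len(new2) < 2 or new1 in edge_set or new2 in edge_set:
        return                                  # would create a loop or a multiple edge
    edge_set.discard(frozenset((a, b)))
    edge_set.discard(frozenset((c, d)))
    edge_set.update((new1, new2))
    edges[i], edges[j] = (a, d), (c, b)

def triangle_count(edge_set, n):
    """Number of triangles; each one is seen once from each of its three edges."""
    adj = [set() for _ in range(n)]
    for u, v in map(tuple, edge_set):
        adj[u].add(v)
        adj[v].add(u)
    return sum(len(adj[u] & adj[v]) for u, v in map(tuple, edge_set)) // 3

def swap_chain_test(edge_list, n, statistic, n_samples=100, steps=1000, seed=0):
    """Return (observed value, simulated values, accept?) for the range test described above."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edge_list]
    edge_set = set(map(frozenset, edges))
    observed = statistic(edge_set, n)
    values = []
    for _ in range(n_samples):
        for _ in range(steps):
            swap_step(edges, edge_set, rng)
        values.append(statistic(edge_set, n))
    return observed, values, min(values) <= observed <= max(values)

# Hypothetical example: two disjoint copies of K5, a 4-regular graph with 20 triangles.
k5_pair = [(u, v) for base in (0, 5) for u in range(base, base + 5) for v in range(u + 1, base + 5)]
observed, values, accepted = swap_chain_test(k5_pair, 10, triangle_count)
```

Two disjoint complete graphs have the largest possible number of triangles among 4-regular graphs on 10 vertices, so the triangle count separates this graph sharply from typical graphs with the same degrees; a statistic that is nearly constant on the degree class would have no power at all.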

We have found by computer simulations that graphs generated by the beta model have only
one eigenvalue proportional to $n$; all the others are of order $\sqrt{n}$. We think that this is
a characteristic property of beta graphs. One wonders: if

- the beta model covers all possible degree sequences, and

- the conditional distribution is uniform over graphs sharing the same degree sequence,

then how is it possible that a graph behaves differently from the typical graphs generated by the
beta model? Of course there are graphs having many large eigenvalues. But where do they come
from, once the beta model can generate all graphs? A possible resolution of this puzzle is the
following.

Let us form a meta graph whose vertices are the graphs sharing the same degree sequence. Let us
say that two graphs are neighbors in this meta graph if one is obtained from the other by a
single swap: if we have four vertices A, B, C, D in a graph such that AC and BD are edges but AD
and BC are not, then by deleting AC and BD and adding AD and BC we form a new graph with the same
degree sequence. The degree of a graph in this meta graph goes parallel with the second largest
eigenvalue: typical beta model graphs have minimal degree, and any increase in their degree
results in a more complicated eigenvalue structure. Perhaps the degree in the meta graph is the
most characteristic statistic for the beta model.
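
The degree of a graph in the meta graph is simply the number of swaps it admits. A minimal sketch of that count, assuming the graph is stored as a list of neighbor sets (the function name `meta_degree` and this representation are illustrative):

```python
from itertools import combinations

def meta_degree(adj):
    """Count the single swaps available in a graph given as a list of neighbor sets.

    A swap takes two vertex-disjoint edges and re-pairs their four endpoints,
    which is allowed only if both new edges are absent from the graph.
    """
    edges = [(u, v) for u in range(len(adj)) for v in adj[u] if u < v]
    count = 0
    for (a, c), (b, d) in combinations(edges, 2):
        if len({a, b, c, d}) < 4:
            continue                             # the two edges share a vertex
        # edges a-c and b-d can be re-paired as {a-d, b-c} or as {a-b, c-d}
        if d not in adj[a] and c not in adj[b]:
            count += 1
        if b not in adj[a] and d not in adj[c]:
            count += 1
    return count

path4 = [{1}, {0, 2}, {1, 3}, {2}]               # path 0-1-2-3
cycle4 = [{1, 3}, {0, 2}, {1, 3}, {0, 2}]        # cycle 0-1-2-3-0
complete4 = [{1, 2, 3}, {0, 2, 3}, {0, 1, 3}, {0, 1, 2}]
degrees = (meta_degree(path4), meta_degree(cycle4), meta_degree(complete4))
```

The count is quadratic in the number of edges, so for large graphs one would estimate it from a random sample of edge pairs instead.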



3 The k-beta model 

The maximum likelihood equations for the parameters $b(\cdot,\cdot)$ in (5) say that, inside each
subgraph determined by a given pair of labels, the expected values of the degrees should be the
same as the degrees in the given graph. This is the case when the labels are known. With unknown
labels we can form a two-level optimization: for each set of labels we first determine the
parameters $b(\cdot,\cdot)$, then change a small number of labels and repeat the calculation of
the parameters. But the procedure is slow even for graphs of moderate size. The spectral methods
available for block models fail for coloring k-beta models because the model loses the well
pronounced checkerboard character of block models. It is ANOVA that offers an applicable
algorithm. For any set $C$ of labels $c(\cdot)$ let us calculate the statistic

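$$ Q(C) = \sum_{i=2}^{n} \sum_{j=1}^{i-1} \bigl( \varepsilon(i,j) - u(c(i),c(j)) - v(i,c(j)) - v(j,c(i)) \bigr)^2, \tag{10} $$

where

$$ u(s,t) = \frac{\sum_{c(i)=s} \sum_{c(j)=t} \varepsilon(i,j)}{\sum_{c(i)=s} \sum_{c(j)=t} 1} \tag{11} $$

and

$$ v(i,t) = \frac{\sum_{c(j)=t} \bigl( \varepsilon(i,j) - u(c(i),t) \bigr)}{\sum_{c(j)=t} 1}. \tag{12} $$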

$Q(C)$ is the sum of the two-way ANOVA sums of squares calculated independently for the subgraphs
defined by pairs of labels. Starting from a uniform random set $C$ of labels on the vertices and
perturbing a small number of labels in each step, a simple greedy optimization results
in a good set of labels, which is close to the original (true) labels.
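
A minimal sketch of this coloring step follows. It assumes that $u(s,t)$ is the mean of $\varepsilon(i,j)$ over the pairs with labels $s$ and $t$ and that $v(i,t)$ is the deviation of the mean of vertex $i$ over label $t$ from $u(c(i),t)$, which is the usual two-way ANOVA residual. Excluding the diagonal from the means, using 0-based labels, and changing one label at a time are choices made here, and the function names are hypothetical.

```python
import random

def anova_q(eps, c, k):
    """Two-way ANOVA residual sum of squares Q(C) for adjacency matrix eps and labels c."""
    n = len(eps)
    members = [[j for j in range(n) if c[j] == t] for t in range(k)]
    # row_mean[i][t]: mean of eps[i][j] over j != i with label t
    row_mean = [[None] * k for _ in range(n)]
    for i in range(n):
        for t in range(k):
            js = [j for j in members[t] if j != i]
            if js:
                row_mean[i][t] = sum(eps[i][j] for j in js) / len(js)
    # u[s][t]: mean of eps[i][j] over pairs i != j with labels s and t
    u = [[0.0] * k for _ in range(k)]
    for s in range(k):
        for t in range(k):
            vals = [eps[i][j] for i in members[s] for j in members[t] if i != j]
            u[s][t] = sum(vals) / len(vals) if vals else 0.0
    q = 0.0
    for i in range(1, n):
        for j in range(i):
            s, t = c[i], c[j]
            v_i = row_mean[i][t] - u[s][t]       # effect of vertex i towards label t
            v_j = row_mean[j][s] - u[t][s]       # effect of vertex j towards label s
            q += (eps[i][j] - u[s][t] - v_i - v_j) ** 2
    return q

def greedy_labels(eps, k, sweeps=20, seed=0):
    """Start from random labels and keep any single-label change that lowers Q(C)."""
    rng = random.Random(seed)
    n = len(eps)
    c = [rng.randrange(k) for _ in range(n)]
    best = anova_q(eps, c, k)
    for _ in range(sweeps):
        improved = False
        for i in rng.sample(range(n), n):
            keep = c[i]
            for t in range(k):
                if t == keep:
                    continue
                c[i] = t
                q = anova_q(eps, c, k)
                if q < best - 1e-12:
                    best, keep, improved = q, t, True
            c[i] = keep
        if not improved:
            break
    return c, best

# Hypothetical checkerboard example: complete bipartite graph between vertices 0-3 and 4-7.
n = 8
eps = [[1 if (i < 4) != (j < 4) else 0 for j in range(n)] for i in range(n)]
true_labels = [0, 0, 0, 0, 1, 1, 1, 1]
q_true = anova_q(eps, true_labels, 2)
q_wrong = anova_q(eps, [0, 1, 0, 1, 0, 1, 0, 1], 2)
labels, q_found = greedy_labels(eps, 2)
```

In this deterministic example the true labels give `q_true == 0`, while a labeling that mixes the two sides gives a positive value. A greedy search of this kind can stop in a local minimum, so in practice one would restart it from several random labelings.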
For evaluating the character of a random graph we use the number 

$$ \exp\left( - \frac{\sum_{i=2}^{n} \sum_{j=1}^{i-1} \bigl( p(i,j) \log p(i,j) + (1 - p(i,j)) \log(1 - p(i,j)) \bigr)}{n(n-1)/2} \right). \tag{13} $$
We call it the delogarithmed average entropy, or DAE. This is a number between 1 and 2. If it is
close to 1, the graph is almost deterministic: the probabilities are close to 0 or 1. In
checkerboard block models this means that empty and full subgraphs are amalgamated together. If
DAE is close to 2, then the graph has no structure at all. DAE depends on the edge density, too.
The above tendency is valid for edge density 1/2; for other edge densities the cut point is closer
to 1. According to our experience, if DAE is smaller than 1.9 while the edge density is one half,
then we are able to reconstruct the original labels. For these graphs the number of non-trivial
eigenvalues is $2k - 1$, thus the spectrum determines the number of different labels.
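
The two extreme values of DAE are easy to check numerically. A minimal sketch of (13), assuming natural logarithms and the convention $0 \log 0 = 0$ (the function name `dae` is illustrative):

```python
import math

def dae(p):
    """Delogarithmed average entropy (13) of a symmetric matrix p of edge probabilities."""
    n = len(p)
    total = 0.0
    for i in range(1, n):
        for j in range(i):
            q = p[i][j]
            for x in (q, 1.0 - q):
                if x > 0.0:                      # convention: 0 * log 0 = 0
                    total += x * math.log(x)
    return math.exp(-total / (n * (n - 1) / 2))

n = 6
coin_flip = [[0.5] * n for _ in range(n)]                                        # no structure
checkerboard = [[1.0 if (i < 3) != (j < 3) else 0.0 for j in range(n)] for i in range(n)]
dae_coin = dae(coin_flip)        # equals 2 up to rounding
dae_board = dae(checkerboard)    # equals 1
```

Any other choice of logarithm base would change the upper end of the range, so the bounds 1 and 2 hold for the natural logarithm used here.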
The k-beta model has a sister model

$$ r_{ij} = \sum_{s=1}^{k} b(i,s)\, b(j,s), \tag{14} $$

which we call the small odds rank model. Strictly speaking we ought to redefine the diagonal of
the odds matrix, but perhaps the name is permissible without doing so. The maximum likelihood
estimation of the parameters in small odds rank models is straightforward, and the block structure
is detectable in the estimated parameters. Actually the block model is in the intersection of
the k-beta and small odds rank models, thus if there is any block structure in the graph, it is
detectable even when fitting a k-beta model to the graph. But if there is no block structure and
we try to use ANOVA coloring for a small odds rank graph, then the algorithm is no longer
stable: it results in a different local minimum in each run.